 synthesis planning




Double-Ended Synthesis Planning with Goal-Constrained Bidirectional Search

Neural Information Processing Systems

Computer-aided synthesis planning (CASP) algorithms have demonstrated expert-level abilities in planning retrosynthetic routes to molecules of low to moderate complexity. However, current search methods assume the sufficiency of reaching arbitrary building blocks, failing to address the common real-world constraint where using specific molecules is desired. To this end, we present a formulation of synthesis planning with starting material constraints. Under this formulation, we propose Double-Ended Synthesis Planning (DESP), a novel CASP algorithm under a bidirectional graph search scheme that interleaves expansions from the target and from the goal starting materials to ensure constraint satisfiability. The search algorithm is guided by a goal-conditioned cost network learned offline from a partially observed hypergraph of valid chemical reactions. We demonstrate the utility of DESP in improving solve rates and reducing the number of search expansions by biasing synthesis planning towards expert goals on multiple new benchmarks.
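As a rough illustration of the starting-material-constrained formulation described in this abstract, the sketch below enumerates routes on a toy reaction table and keeps only those whose leaves include a required starting material. It is not the DESP algorithm: it uses exhaustive depth-limited enumeration rather than a guided bidirectional search, and every name in it (`REACTIONS`, `BUILDING_BLOCKS`, `leaf_sets`, `constrained_routes`, the molecule labels) is a hypothetical placeholder.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy data: product -> alternative reactant sets (one per reaction).
REACTIONS: Dict[str, List[FrozenSet[str]]] = {
    "target": [frozenset({"A", "B"}), frozenset({"C"})],
    "A": [frozenset({"bb1", "bb2"})],
    "B": [frozenset({"bb3"})],
    "C": [frozenset({"bb1", "goal_sm"})],
}
# Hypothetical stock of purchasable building blocks.
BUILDING_BLOCKS: Set[str] = {"bb1", "bb2", "bb3"}


def leaf_sets(mol: str, terminals: Set[str], depth: int) -> List[FrozenSet[str]]:
    """Return the set of leaf molecules for every route to `mol` within `depth` steps."""
    if mol in terminals:
        return [frozenset({mol})]
    if depth == 0 or mol not in REACTIONS:
        return []  # dead end: not purchasable and cannot be expanded further
    routes: List[FrozenSet[str]] = []
    for reactants in REACTIONS[mol]:
        # Combine one sub-route per reactant; an unsolvable reactant kills the reaction.
        partial: List[FrozenSet[str]] = [frozenset()]
        for reactant in sorted(reactants):
            sub_routes = leaf_sets(reactant, terminals, depth - 1)
            partial = [p | s for p in partial for s in sub_routes]
        routes.extend(partial)
    return routes


def constrained_routes(target: str, goal: str, depth: int = 5) -> List[FrozenSet[str]]:
    """Keep only routes that actually use the required starting material `goal`."""
    terminals = BUILDING_BLOCKS | {goal}
    return [leaves for leaves in leaf_sets(target, terminals, depth) if goal in leaves]


if __name__ == "__main__":
    print(constrained_routes("target", "goal_sm"))
```

In this toy table the target has two solvable routes, but only the one passing through `C` uses `goal_sm`, so it is the only route that satisfies the constraint. An unconstrained planner could return either.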



Fast and scalable retrosynthetic planning with a transformer neural network and speculative beam search

Andronov, Mikhail, Andronova, Natalia, Wand, Michael, Schmidhuber, Jürgen, Clevert, Djork-Arné

arXiv.org Artificial Intelligence

AI-based computer-aided synthesis planning (CASP) systems are in demand as components of AI-driven drug discovery workflows. However, the high latency of such CASP systems limits their utility for high-throughput synthesizability screening in de novo drug design. We propose a method for accelerating multi-step synthesis planning systems that rely on SMILES-to-SMILES transformers as single-step retrosynthesis models. Our approach reduces the latency of SMILES-to-SMILES transformers powering multi-step synthesis planning in AiZynthFinder through speculative beam search combined with a scalable drafting strategy called Medusa. Replacing standard beam search with our approach allows the CASP system to solve 26% to 86% more molecules under the same time constraints of several seconds. Our method brings AI-based CASP systems closer to meeting the strict latency requirements of high-throughput synthesizability screening and improving general user experience.


TempRe: Template generation for single and direct multi-step retrosynthesis

Xuan-Vu, Nguyen, Armstrong, Daniel P, Jončev, Zlatko, Schwaller, Philippe

arXiv.org Artificial Intelligence

Retrosynthesis planning remains a central challenge in molecular discovery due to the vast and complex chemical reaction space. While traditional template-based methods offer tractability, they suffer from poor scalability and limited generalization, and template-free generative approaches risk generating invalid reactions. In this work, we propose TempRe, a generative framework that reformulates template-based approaches as sequence generation, enabling scalable, flexible, and chemically plausible retrosynthesis. We evaluated TempRe across single-step and multi-step retrosynthesis tasks, demonstrating its superiority over both template classification and SMILES-based generation methods. On the PaRoutes multi-step benchmark, TempRe achieves strong top-k route accuracy. Furthermore, we extend TempRe to direct multi-step synthesis route generation, providing a lightweight and efficient alternative to conventional single-step and search-based approaches. These results highlight the potential of template generative modeling as a powerful paradigm in computer-aided synthesis planning.


Chemical reasoning in LLMs unlocks steerable synthesis planning and reaction mechanism elucidation

Bran, Andres M, Neukomm, Theo A, Armstrong, Daniel P, Jončev, Zlatko, Schwaller, Philippe

arXiv.org Artificial Intelligence

While machine learning algorithms have been shown to excel at specific chemical tasks, they have struggled to capture the strategic thinking that characterizes expert chemical reasoning, limiting their widespread adoption. Here we demonstrate that large language models (LLMs) can serve as powerful chemical reasoning engines when integrated with traditional search algorithms, enabling a new approach to computer-aided chemistry that mirrors human expert thinking. Rather than using LLMs to directly manipulate chemical structures, we leverage their ability to evaluate chemical strategies and guide search algorithms toward chemically meaningful solutions. We demonstrate this paradigm through two fundamental challenges: strategy-aware retrosynthetic planning and mechanism elucidation. In retrosynthetic planning, our method allows chemists to specify desired synthetic strategies in natural language to find routes that satisfy these constraints in vast searches. In mechanism elucidation, LLMs guide the search for plausible reaction mechanisms by combining chemical principles with systematic exploration. Our approach shows strong performance across diverse chemical tasks, with larger models demonstrating increasingly sophisticated chemical reasoning. Our approach establishes a new paradigm for computer-aided chemistry that combines the strategic understanding of LLMs with the precision of traditional chemical tools, opening possibilities for more intuitive and powerful chemical reasoning systems.


Reviews: Depth-First Proof-Number Search with Heuristic Edge Cost and Application to Chemical Synthesis Planning

Neural Information Processing Systems

Originality: PNS and related algorithms have not been evaluated for synthesis planning since work by Heifets and others several years ago. Revisiting this class of algorithms and proposing modifications to improve performance in multi-step synthesis planning is nice to see. Quality: The empirical evaluation is not as strong as it could be, but the conceptual contribution of this work is still important for the problem of synthesis planning. Clarity: The description of algorithms in 254-266 and elsewhere is not complete enough to reimplement the models and baselines. The dataset split, details of template extraction, network training, etc. are not provided either, and the code is not available. Significance: The novelty of the modifications to the algorithm may be minor, but evaluating it in the context of this problem is important.


ASKCOS: an open source software suite for synthesis planning

Tu, Zhengkai, Choure, Sourabh J., Fong, Mun Hong, Roh, Jihye, Levin, Itai, Yu, Kevin, Joung, Joonyoung F., Morgan, Nathan, Li, Shih-Cheng, Sun, Xiaoqi, Lin, Huiqian, Murnin, Mark, Liles, Jordan P., Struble, Thomas J., Fortunato, Michael E., Liu, Mengjie, Green, William H., Jensen, Klavs F., Coley, Connor W.

arXiv.org Artificial Intelligence

The advancement of machine learning and the availability of large-scale reaction datasets have accelerated the development of data-driven models for computer-aided synthesis planning (CASP) in the past decade. Here, we detail the newest version of ASKCOS, an open source software suite for synthesis planning that makes available several research advances in a freely available, practical tool. Four one-step retrosynthesis models form the basis of both interactive planning and automatic planning modes. Retrosynthetic planning is complemented by other modules for feasibility assessment and pathway evaluation, including reaction condition recommendation, reaction outcome prediction, and auxiliary capabilities such as solubility prediction and quantum mechanical descriptor prediction. ASKCOS has assisted hundreds of medicinal, synthetic, and process chemists in their day-to-day tasks, complementing expert decision making. It is our belief that CASP tools like ASKCOS are an important part of modern chemistry research, and that they offer ever-increasing utility and accessibility.


Tango*: Constrained synthesis planning using chemically informed value functions

Armstrong, Daniel, Jončev, Zlatko, Guo, Jeff, Schwaller, Philippe

arXiv.org Artificial Intelligence

Computer-aided synthesis planning (CASP) has made significant strides in generating retrosynthetic pathways for simple molecules in an unconstrained fashion. Recent work introduces a specialised bidirectional search algorithm with forward and retro expansion to address the starting material-constrained synthesis problem, allowing CASP systems to provide synthesis pathways from specified starting materials, such as waste products or renewable feedstocks. In this work, we introduce a simple guided search which allows solving the starting material-constrained synthesis planning problem using an existing, uni-directional search algorithm, Retro*. We show that by optimising a single hyperparameter, Tango* outperforms existing methods in terms of efficiency and solve rate. We find the Tango* cost function catalyses strong improvements for the bidirectional DESP methods. Our method also achieves lower wall-clock times while proposing synthetic routes of similar length, a common metric for route quality.